16 research outputs found

    Towards using web-crawled data for domain adaptation in statistical machine translation

    This paper reports on ongoing work on domain adaptation of statistical machine translation using domain-specific data obtained by domain-focused web crawling. We present a strategy for crawling monolingual and parallel data and for exploiting them in testing, language modelling, and system tuning in a phrase-based machine translation framework. The proposed approach is evaluated on two domains, Natural Environment and Labour Legislation, and two language pairs, English–French and English–Greek.

    Introducing the Digital Language Equality Metric: Technological Factors

    This paper introduces the concept of Digital Language Equality (DLE) developed by the EU-funded European Language Equality (ELE) project, and describes the associated DLE Metric with a focus on its technological factors (TFs), which are complemented by situational contextual factors. This work aims at objectively describing the level of technological support of all European languages and lays the foundation for a large-scale EU-wide programme to ensure that these languages can continue to exist and prosper in the digital age, serving the present and future needs of their speakers. The paper situates this ongoing work, which has a strong European focus, in the broader context of related efforts, and explains how the DLE Metric can help track progress towards DLE for all languages of Europe, focusing in particular on the role played by the TFs. These are derived from the European Language Grid (ELG) Catalogue, which provides the empirical basis for measuring the level of digital readiness of all European languages. The DLE Metric scores can be consulted through an online interactive dashboard that shows the level of technological support of each European language and tracks overall progress towards DLE.

    Digital language equality: definition, metric, dashboard

    This chapter presents the concept of Digital Language Equality (DLE) that was at the heart of the European Language Equality (ELE) initiative, and describes the DLE Metric, which includes technological factors (TFs) and contextual factors (CFs). The former concern the availability of Language Resources and Technologies (LRTs) for the languages of Europe, based on the data included in the European Language Grid (ELG) catalogue, while the latter reflect the broader socio-economic contexts and ecosystems of the languages, as these determine the potential for LRT development. The chapter discusses related work, presents the DLE definition and describes how it was implemented through the DLE Metric, explaining how the TFs and CFs were quantified. The resulting DLE Metric scores for Europe’s languages can be visualised and compared through the interactive DLE dashboard to monitor progress towards DLE in Europe.

    The European language equality project: enabling digital language equality for all European languages by 2030

    The EU project European Language Equality is currently preparing a strategic research, innovation and deployment agenda and roadmap, which will provide a detailed plan and strategic recommendations on how to achieve digital language equality in Europe by 2030. This article presents an overview of the project, our definition of digital language equality and preliminary results obtained with the associated DLE metric. The final project documentation, including the strategic agenda, will be handed over to representatives of the European Union in mid-2022.

    Results of the forward-looking community-wide consultation

    Within the ELE project three complementary online surveys were designed and implemented to consult the Language Technology (LT) community with regard to the current state of play and the future situation in about 2030 in terms of Digital Language Equality (DLE). While Chapters 4 and 38 provide a general overview of the community consultation methodology and the results with regard to the current situation as of 2022, this chapter summarises the results concerning the future situation in 2030. All of these results have been taken into account for the specification of the project’s Strategic Research, Innovation and Implementation Agenda (SRIA) and Roadmap for Achieving Full DLE in Europe by 2030.

    Nutrient Recovery Technologies from Waste Flows

    National Technical University of Athens. Master's thesis. Interdisciplinary-Interdepartmental Postgraduate Studies Programme (D.P.M.S.) “Environment and Development”

    Discriminating CEFR levels in Greek L2: a corpus-based study of young learners’ written narratives

    In line with cross-linguistic research that aims to identify criterial features discriminating the CEFR proficiency levels, the present study investigates language elements that are core characteristics of each proficiency level for Greek L2. It is based on a graded corpus of 150 written narratives produced by young L2 learners (aged 8–14) at levels A2 to B2. This corpus was annotated for a set of features at both the sentence and discourse level, such as clause subordination, connectives, modifiers and grammatical accuracy. Statistical analysis identified certain aspects of these features that discriminate language proficiency levels in L2 Greek narratives and are put forward as criterial features. These include the frequency of dependent and centre-embedded clauses, the gradual decrease of additive connectives and the emergence of contrastive and inferential ones, the felicitous use of clitics, and the use of evaluative adverbs and adjectives.
